It can be surprising to naive observers that statistics from different research studies 
concerning the same issue can produce very dissimilar or contradictory results. In order 
to resolve this paradox, many people conclude that statistics are not actually reliable 
indicators of reality. Those versed in statistics, however, understand that statistics rely 
on assumptions. Proceeding from different assumptions, or invoking assumptions which 
don't apply to the research situation in question, can lead to divergent results. This 
digest attempts to warn against the frequent misuses and abuses of statistics. Although 
these issues are familiar to most statisticians, they can be easily overlooked. These 
problems can be considered in three broad classes of statistical pitfalls: sources of bias, 
errors in methodology, and misinterpretation of results. 

SOURCES OF BIAS 



Statistical methodology assists researchers in making inferences about a large group (a 
population) based on observations of a smaller subset of that group (a sample). 

Sources of bias are conditions or circumstances which affect the external validity of 
statistical results. Thus, in order for a researcher to draw legitimate conclusions about 
the specified population, two characteristics must be present within the sample: 
representative sampling and valid statistical assumptions. 

Representative sampling is one of the most fundamental principles of inferential 
statistics. This type of sampling implies that the observed group has similar 
characteristics to the target population in all areas that are relevant to the study. 
Representative sampling is necessary to make valid inferences about the target 
population. Unfortunately, representative sampling can be difficult to achieve. The ideal 
sample is chosen by selecting members of the population at random, with each member 
having an equal probability of being selected for the sample. When randomization is not 
possible, researchers usually try to choose a sample in which their group of subjects 
"parallels" the population with respect to the characteristics that are thought to be 
important to the particular investigation. 
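As a minimal sketch of the equal-probability selection described above, the following draws a simple random sample without replacement. The population of numbered member IDs, the sample size, and the function name are hypothetical and chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; each member is equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, n)

# Hypothetical sampling frame of 1,000 member IDs.
population = list(range(1, 1001))
sample = simple_random_sample(population, 50, seed=42)
print(sorted(sample)[:5])
```

In practice the hard part is usually obtaining a complete list of the population to draw from, not the draw itself.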

Statistical assumptions made about various aspects of the problem determine the 
statistical procedure's validity. This means that certain aspects of the measured 
variables must conform to assumptions which underlie the statistical procedures to be 
applied. For example, well-known linear methods such as analysis of variance (ANOVA) 
depend on the assumptions of normality and independence. The assumption of 
normality implies that the scores in each treatment group are distributed in a way that 
corresponds to the so-called "normal" (or Gaussian) distribution, while the assumption of 
independence indicates that each subject's scores are uninfluenced by the scores 
of anyone else who was tested. 
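One rough way to screen a treatment group for gross departures from normality is to look at its skewness and excess kurtosis, both of which are near zero for normally distributed scores. The sketch below is a descriptive check only, not a formal test, and the scores are hypothetical. Independence generally cannot be checked this way; it has to be secured by the design of the study.

```python
from statistics import mean

def central_moment(scores, order):
    """Average of (score - mean) ** order over the sample."""
    center = mean(scores)
    return sum((x - center) ** order for x in scores) / len(scores)

def skewness(scores):
    """Sample skewness; close to 0 for a symmetric distribution."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5

def excess_kurtosis(scores):
    """Sample kurtosis minus 3; close to 0 for a normal distribution."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3

# Hypothetical scores for one treatment group.
group_scores = [12, 15, 14, 10, 13, 16, 11, 14, 13, 12]
print(skewness(group_scores), excess_kurtosis(group_scores))
```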

ERRORS IN METHODOLOGY 



There are a number of ways that statistical techniques can be misapplied to problems in 
the real world. These types of errors can lead to invalid or inaccurate results. Three of 
the most common hazards are designing experiments with insufficient statistical power, 
ignoring measurement error, and performing multiple comparisons. 

Two types of errors can occur when making inferences based on a statistical hypothesis 
test: a Type I error happens if the null hypothesis is rejected when it should not be (the 
probability of this is called "alpha"); and a Type II error results from the failure to reject a 
null hypothesis when it should be rejected (the probability of this is called "beta"). Statistical 
power refers to the probability of avoiding a Type II error and depends on the ability of 
one's statistical test to detect true differences of a particular size. The power of the test 
generally depends on four things: the sample size, the desired detectable effect size, 
the specified Type I error rate, and the variability of the sample. Based on these 
parameters, the power level of the experiment can be calculated. Alternatively, the 
researcher can specify the desired power level (e.g. .80), the Type I error level, and 
the minimum effect size which would be considered "interesting," and then solve for the 
required sample size. (See Cohen, 1988, for more details on power analysis.) 
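The two directions of a power calculation can be sketched with a normal approximation for a two-sided comparison of two equal-sized groups. This is an approximation under stated assumptions (known common variance, standardized effect size d), not the exact t-based calculation, and the function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power to detect standardized effect d with n cases per group."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    noncentrality = d * sqrt(n_per_group / 2)
    # The negligible chance of rejecting in the wrong direction is ignored.
    return STD_NORMAL.cdf(noncentrality - z_crit)

def approx_n_per_group(d, power=0.80, alpha=0.05):
    """Approximate cases per group needed to reach the desired power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / d) ** 2)

n_needed = approx_n_per_group(0.5)      # "medium" effect, power .80, alpha .05
print(n_needed, approx_power(0.5, n_needed))
```

Exact calculations based on the t distribution give slightly larger sample sizes, so a figure from this sketch is best read as a lower bound.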

If there is little statistical power, a researcher risks overlooking the effect which he/she is 
attempting to discover. This is especially important if one intends to make inferences 
based on a finding of no difference. However, it should be noted that it is possible to 
have too much statistical power. If the sample is too large, nearly any difference, no 
matter how small or meaningless from a practical standpoint, will be "statistically 
significant." This occurrence can be particularly problematic in applied settings, where 
important decisions are determined by statistical results. 
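To illustrate how sample size alone can produce "significance," the sketch below computes a two-sided p-value for the same trivially small standardized difference at two sample sizes. It assumes a simple z-test with known variance and equal group sizes; the effect size of 0.02 is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(d, n_per_group):
    """Two-sided p-value for an observed standardized difference d (z-test sketch)."""
    z = d * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

tiny_effect = 0.02  # two hundredths of a standard deviation
for n in (100, 10_000, 100_000):
    print(n, two_sided_p(tiny_effect, n))
```

The difference between the groups is identical in every row; only the number of cases changes.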

Studying the relationships among multiple variables is especially troublesome because the 
desired knowledge is complex in nature and many different combinations of factors 
need to be examined. Each additional comparison is another opportunity for a chance 
difference to appear significant. The best strategy to check these different combinations 
of factors is to rerun the experiment and see which comparisons show differences in both 
the original and the repeated study (also known as replication). Although this method is 
not irrefutable, it should provide a good notion of which effects are real and which are 
not. If replication is not a possibility, cross-validation, a technique which involves 
setting aside part of the sample as a validation sample, can also be helpful. In this 
system, the statistics of interest are 
computed on the main sample and are checked against the validation sample to verify 
that the effects are real. Using this technique, results that are spurious will usually be 
revealed by the validation sample. 
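Two pieces of this can be sketched briefly. The first function shows why many comparisons are hazardous: if the tests are independent and every null hypothesis is true, the chance of at least one false positive is 1 - (1 - alpha)^k. The second sets aside a random validation sample. The independence assumption, the 30 percent validation fraction, and the function names are illustrative choices.

```python
import random

def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one Type I error across independent tests of true nulls."""
    return 1 - (1 - alpha) ** n_tests

def split_sample(cases, validation_fraction=0.3, seed=None):
    """Randomly divide cases into a main sample and a held-out validation sample."""
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

print(familywise_error_rate(20))                 # about 0.64 for 20 tests at .05
main_sample, validation_sample = split_sample(range(100), seed=1)
print(len(main_sample), len(validation_sample))  # 70 30
```

Effects found in the main sample would then be recomputed on the validation sample only, and kept only if they reappear there.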

Most statistical models assume error-free measurement, at least of independent 
(predictor) variables. However, measurements are seldom perfect. Therefore, close 
attention must be paid to the effects of measurement errors. This is especially important 
when dealing with noisy data such as questionnaire responses or processes which are 
difficult to measure precisely. 
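One well-known effect of measurement error is attenuation: random error in two measures pulls their observed correlation toward zero. The sketch below uses the classical attenuation formula, in which the observed correlation equals the true correlation times the square root of the product of the two reliabilities. The reliability values are hypothetical.

```python
from math import sqrt

def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Correlation expected between two imperfectly reliable measures."""
    return true_r * sqrt(reliability_x * reliability_y)

def disattenuated_correlation(observed_r, reliability_x, reliability_y):
    """Estimate of the underlying correlation, corrected for unreliability."""
    return observed_r / sqrt(reliability_x * reliability_y)

# A true correlation of .50 measured with reliabilities of .70 and .80.
observed = attenuated_correlation(0.50, 0.70, 0.80)
print(observed)
print(disattenuated_correlation(observed, 0.70, 0.80))
```

The correction is only as good as the reliability estimates fed into it, so a corrected value deserves the same caution as any other estimate.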

Methods are available for taking measurement error into account in some statistical 
models. In particular, structural equation modeling allows one to specify relationships 
between "indicators," or measurement tools, and the underlying latent variables being 
measured, in the context of a linear path model. For more information on structural 
equation modeling and its uses, see Bollen (1989). 

PROBLEMS WITH INTERPRETATION 

In addition to difficulties with bias and methodology, there are a number of problems 
which can arise in the context of substantive interpretation as well. These problems 
usually involve determining the significance of certain findings, avoiding confusion 
between precision and accuracy, and unraveling the causal relationships among 
variables. 

The difference between "significance" in the statistical sense and "significance" in the 
practical sense continues to elude many consumers of statistical results. Significance 
(in the statistical sense) is really as much a function of sample size and experimental 
design as it is a function of strength of relationship. With low power, a researcher may 
overlook a useful relationship; with excessive power, one may find microscopic effects 
that have no real practical value. A reasonable remedy is to report 
results in terms of effect sizes (see Cohen, 1994), so that the size of the effect is 
presented in terms that make quantitative sense. 
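One widely used effect size for a two-group comparison is Cohen's d, the difference between the group means divided by the pooled standard deviation. The sketch below is a minimal version; the two sets of scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)

# Hypothetical scores for a treatment group and a control group.
treatment = [2, 4, 6, 8, 10]
control = [1, 3, 5, 7, 9]
print(cohens_d(treatment, control))  # about 0.32 standard deviations
```

Unlike a p-value, this figure does not grow as cases are added, so it can be compared across studies of different sizes.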

Precision and accuracy are two concepts which are frequently confused. The 
distinction is subtle but important: precision refers to how finely an estimate is specified, 
whereas accuracy refers to how close an estimate is to the true value. Estimates can be 
precise without being accurate, a fact often glossed over when interpreting computer 
output containing results specified to the fourth or sixth or eighth decimal place. 
Therefore, one should report no more decimal places than one is fairly confident 
reflect something meaningful. 
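One common rule of thumb, used here as an assumption and not as a requirement of the text, is to round an estimate to the decimal place of the first significant digit of its standard error. A minimal sketch, with hypothetical values:

```python
from math import floor, log10

def round_to_standard_error(estimate, standard_error):
    """Round an estimate to the place of the standard error's leading digit."""
    if standard_error <= 0:
        raise ValueError("standard_error must be positive")
    decimals = -floor(log10(standard_error))
    return round(estimate, decimals)

# Software may print 3.14159265, but a standard error of 0.2 supports one decimal.
print(round_to_standard_error(3.14159265, 0.2))    # 3.1
print(round_to_standard_error(12.34567, 0.034))    # 12.35
```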

Assessing causality is the goal of most statistical analysis, yet its subtleties escape 
many statistical consumers. To draw a causal inference, one must have 
random assignment. That is, the experimenter must be the one assigning values of 
predictor variables to cases. If the values are not assigned or manipulated, the most 
one can hope for is to show evidence of a relationship of some kind. Observational 
studies are very limited in their ability to illuminate causal relationships. 
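A small simulation can make the point concrete. In the sketch below the treatment has no effect at all, yet when cases select their own treatment on the basis of a background variable that also drives the outcome, the groups differ substantially. Under random assignment the difference disappears. All variable names and distributions are hypothetical.

```python
import random
from statistics import mean

def mean_difference(outcomes, treated):
    """Mean outcome of treated cases minus mean outcome of untreated cases."""
    treated_y = [y for y, t in zip(outcomes, treated) if t]
    control_y = [y for y, t in zip(outcomes, treated) if not t]
    return mean(treated_y) - mean(control_y)

def simulate(n=10_000, seed=0):
    rng = random.Random(seed)
    confounder = [rng.gauss(0, 1) for _ in range(n)]
    # The outcome depends on the confounder only; treatment has NO effect.
    outcome = [z + rng.gauss(0, 1) for z in confounder]
    # Observational setting: cases high on the confounder tend to choose treatment.
    self_selected = [z + rng.gauss(0, 1) > 0 for z in confounder]
    # Experimental setting: a coin flip decides treatment.
    randomized = [rng.random() < 0.5 for _ in range(n)]
    return mean_difference(outcome, self_selected), mean_difference(outcome, randomized)

observational_diff, randomized_diff = simulate()
print(observational_diff, randomized_diff)
```

The observational comparison shows a sizable "effect" that is produced entirely by the background variable, which is why evidence of a relationship alone cannot establish causation.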

Now, of course, many of the things that are of interest to study are not subject to 
experimental manipulation (e.g. health problems/risk factors). In order to understand 
them in a causal framework, a multifaceted approach to the research (you might think of 
it as "conceptual triangulation"), the use of chronologically structured designs (placing 
variables in the roles of antecedents and consequents), and plenty of replication are 
required to come to any strong conclusions regarding causality. 

SUMMARY 



In this paper, some of the trickier aspects of applied data analysis have been discussed. 
In future research or data analysis, people should be certain of the following: 

The sample is representative of the population of interest. 

The right amount of power should be included. 

The best available measurement tools should be used. If there are errors in the 
measures, that fact must be taken into account when interpreting the results. 

Multiple comparisons need to be watched closely. If many tests need to be done, 
replication or cross-validation should be used to verify the results. 

The objective of the study should remain the focus when interpreting the data. 
Therefore, magnitudes rather than p-values should be studied so that one isn't seduced 
by "stars in the tables." 

Numerical notation should be used in a rational way. This will help to avoid confusion 
between precision and accuracy. 

The conditions for causal inference should be understood. 

If causal inference must be made, random assignment should be used. In the absence of 
random assignment, much effort will be needed to uncover causal relationships, 
requiring a variety of approaches to the question. 

Although errors and misconceptions about statistical information are difficult to avoid, 
one can use the above suggestions to help present the information in the clearest way 
possible. 

Mr. Helberg can be reached at SPSS, Inc., 444 N. Michigan Avenue, Chicago, IL 
60611; or via e-mail at chelberg@spss.com 